Reproducibility

A very brief overview

Research Associate | Webber Group

2024-03-14

Most published research is false

  • Whenever people try to replicate published findings, things don’t look great…14
  • Citations aren’t a good metric of quality either5
  • Small sample sizes are a big issue, especially in neuroscience6

Academic journals suck

  • Journal status and paper quality are poorly correlated7,8
  • The field is dominated by five big publishing houses - they are evil9
  • Strong bias against negative results
  • Do they spend their staggering profits checking for obvious signs of fraud, encouraging replication, or even just making science more readable and accessible? Of course not.10
  • Nice documentary on the business of scholarship here11

GUIs suck

  • Graphical user interface (GUI) tools like Excel, SPSS & GraphPad are very opaque and error-prone, as our government learnt during COVID12
  • The Excel mistake heard around the world and the lasting economic repercussions13
  • Proprietary software - many people can’t access it and therefore can’t replicate the analysis
  • No obvious history of changes made or operations performed

Stats suck

  • Frequentist statistics is used almost exclusively across the sciences
    • It is extremely unintuitive, prone to abuse, and rarely done correctly in practice (p-hacking)4
    • Great video on why p-values are hella variable14
  • Bayesian statistics is a fundamentally different approach: no null-hypothesis assumptions, so no p-values and no p-hacking
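How variable are p-values really? Here is a minimal sketch in plain Python (standard library only; the normal-tail approximation to the t-test is an assumption made to avoid a scipy dependency) simulating the same experiment 1,000 times with a genuine effect:

```python
import math
import random
import statistics

def welch_t_pvalue(a, b):
    """Approximate two-sided p-value for a Welch t-test, using a normal
    tail instead of the t distribution (fine for illustration at n=20;
    use scipy.stats.ttest_ind for real work)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return math.erfc(abs(t) / math.sqrt(2))  # P(|Z| > |t|) for standard normal Z

random.seed(1)  # fixed seed so the "experiment" is itself reproducible
pvals = []
for _ in range(1000):
    control = [random.gauss(0.0, 1.0) for _ in range(20)]
    treated = [random.gauss(0.5, 1.0) for _ in range(20)]  # true effect: 0.5 SD
    pvals.append(welch_t_pvalue(control, treated))

pvals.sort()
frac_sig = sum(p < 0.05 for p in pvals) / len(pvals)
print(f"median p = {pvals[500]:.3f}, range {pvals[0]:.2g} to {pvals[-1]:.2f}")
print(f"fraction 'significant' (p < 0.05): {frac_sig:.0%}")
```

Even though every run samples from identical populations with a real effect, the p-values scatter across orders of magnitude and only a minority clear the 0.05 bar at this sample size - exactly the variability the video above is about.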

Experimental designs suck

  • Positive control? pretty pls?

An investigator cannot guarantee that the claims made in a study are correct

  • Reproducibility is important not because it ensures that the results are correct, but rather because it ensures transparency and gives us confidence in understanding exactly what was done.

An example of a wee goof in the field

  • That time a major paper that supposedly discovered A\(\beta\)*56, an oligomer species, turned out to be full of image manipulations, whoops15

Solutions

Registered reports

  • “But the big IF journals don’t do registered reports”
  • Yeah, they’re evil remember? Of course they won’t do anything to make science not suck
    • Also, Nature does do them now (they’re still evil though)

An idealised workflow

  1. Come up with neat project idea
  2. Pre-register study (can be embargoed) and write up registered report
  3. Publish registered report > get valuable feedback from wise and courteous reviewers (let me dream pls)
  4. Apply for funding, pointing to your already peer-reviewed project as evidence of its strength > get all the moneys
  5. Do project as you said you would > get results (everything worked first time)
  6. Publish all the things!

Let there be code

  • Code is just a set of instructions to tell a computer to do something
  • It’s an explicit bridge between your raw experimental data (which should be sacrosanct!) and your reported stats/figures
  • Great video guide here16
  • Version control: wouldn’t it be nice to have a detailed record of every change made to every file in a project, when it was made, and by whom? Use Git and the DRI GitHub!
  • Use whatever, as long as it’s open-source
  • Containerisation: encapsulate your computational environment17
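What does that "explicit bridge" look like in practice? A hypothetical minimal sketch (the data and file names are made up; in a real project the raw CSV would live untouched in something like `data/raw/`, with the script under version control):

```python
import csv
import io
import statistics

# Hypothetical raw data; in a real project this is read from a raw-data
# file that is sacrosanct and never edited by hand.
raw_csv = """condition,value
control,1.1
control,0.9
control,1.0
treated,1.8
treated,2.1
treated,1.9
"""

def summarise(rows):
    """Group values by condition and report mean and sample SD -
    every transformation from raw data to reported stat is on the record."""
    groups = {}
    for row in rows:
        groups.setdefault(row["condition"], []).append(float(row["value"]))
    return {c: (statistics.mean(v), statistics.stdev(v)) for c, v in groups.items()}

summary = summarise(csv.DictReader(io.StringIO(raw_csv)))
for condition, (mean, sd) in sorted(summary.items()):
    print(f"{condition}: mean={mean:.2f}, sd={sd:.2f}")
```

Anyone can rerun this script on the same raw file and get the same numbers; paired with Git history and a containerised environment, that is the whole chain from data to figure, reproducible end to end.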

Particular tool suggestions

  • R or Python for data analysis and use literate programming methods like Quarto and/or Jupyter notebooks1
  • If you want to do Bayesian but still want a GUI: JASP

Data sharing policy

Thanks for listening


References

1. Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454 (2016).
2. Ioannidis, J. P. A. Why most published research findings are false. PLOS Medicine 2, e124 (2005).
3. Smaldino, P. E. & McElreath, R. The natural selection of bad science. Royal Society Open Science 3, 160384 (2016).
4. Head, M. L., Holman, L., Lanfear, R., Kahn, A. T. & Jennions, M. D. The extent and consequences of p-hacking in science. PLOS Biology 13, e1002106 (2015).
5. Yang, Y., Youyou, W. & Uzzi, B. Estimating the deep replicability of scientific findings using human and artificial intelligence. Proceedings of the National Academy of Sciences 117, 10762–10768 (2020).
6. Button, K. S. et al. Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience 14, 365–376 (2013).
7.
8. Brembs, B. Prestigious science journals struggle to reach even average reliability. Frontiers in Human Neuroscience 12, (2018).
9. Racimo, F. et al. Ethical publishing: How do we get there? Philosophy, Theory, and Practice in Biology 14, (2022).
10.
11.
12.
13. Ryssdal, K. The Excel mistake heard round the world. Marketplace (2013).
14.
15.
16. Çetinkaya-Rundel, M. Improve your workflow for reproducible science. (2020).
17. Nüst, D. et al. Ten simple rules for writing Dockerfiles for reproducible data science. PLOS Computational Biology 16, e1008316 (2020).